Resolving Person Names in Web People Search

Authors

  • Krisztian Balog
  • Leif Azzopardi
  • Maarten de Rijke
Abstract

Disambiguating person names in a set of documents (such as a set of web pages returned in response to a person name) is a key task for the presentation of results and the automatic profiling of experts. With largely unstructured documents and an unknown number of people sharing the same name, the problem presents many difficulties and challenges. This chapter treats the task of person name disambiguation as a document clustering problem, where it is assumed that the documents represent particular people. This leads to the person cluster hypothesis, which states that similar documents tend to represent the same person. Single Pass Clustering, k-Means Clustering, Agglomerative Clustering, and Probabilistic Latent Semantic Analysis are employed and empirically evaluated in this context. On the SemEval 2007 Web People Search task, it is shown that the person cluster hypothesis holds reasonably well and that the Single Pass Clustering and Agglomerative Clustering methods provide the best performance.
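To make the clustering view concrete, the following is a minimal sketch of single pass clustering over documents, under several assumptions that do not come from the chapter: whitespace tokenization, raw term-frequency vectors, cosine similarity, a summed-count centroid, and an arbitrary threshold of 0.3. The function names (`bag_of_words`, `cosine`, `single_pass_cluster`) are hypothetical, and the sketch should not be read as the authors' configuration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def single_pass_cluster(docs, threshold=0.3):
    """Assign each document, in order, to the most similar existing
    cluster if the similarity reaches the threshold; otherwise start
    a new cluster. The threshold value is an illustrative assumption."""
    clusters = []  # each entry: {"centroid": Counter, "members": [doc indices]}
    for i, text in enumerate(docs):
        vec = bag_of_words(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            # Summed counts act as the centroid; cosine ignores the scale.
            best["centroid"].update(vec)
        else:
            clusters.append({"centroid": Counter(vec), "members": [i]})
    return [cluster["members"] for cluster in clusters]
```

Each returned list holds the indices of documents assumed to refer to the same person. Because the method makes only one pass, the result depends on document order and on the threshold, and the number of clusters does not need to be fixed in advance, which suits a setting where the number of namesakes is unknown.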

Similar resources

Vitae and Map Display System for People on the Web

We present a system that displays a curriculum vitae with a map to understand people. Our method is based on the following processes: (1) creating curriculum vitae using related work [1], (2) extracting the names of places where the person studied and worked from the vitae, (3) getting such location information as latitudes, longitudes, and addresses from the place names using Google Maps API, ...

Summarizing and Visualizing Web People Search Results

People search is one major search activity on the Web. If the list of people search results is merely “person 1, person 2, . . . and so on,” users have difficulty determining which person clusters they should select. In this paper, we present a project that summarizes and visualizes Web people search results to help users select person clusters more easily. We explore three ways of summarizing ...

Which Who are They? People Attribute Extraction and Disambiguation in Web Search Results

People name search often returns a lot of Web pages containing the strings of personal names. Because many people share the same name, extracting target person attributes (such as birthday, occupation, affiliation, nationality, contact information, etc.) is expected to help differentiate documents related to different people and thus group documents related to the same person. This paper presents the methodol...

Assigning Location Information to Display Individuals on a Map for Web People Search Results

Distinguishing people with identical names is becoming more and more important in Web search. This research aims to display person icons on a map to help users select person clusters that are separated into different people from the result of person searches on the Web. We propose a method to assign person clusters with one piece of location information. Our method is comprised of two processes...

UC3M_13: Disambiguation of Person Names Based on the Composition of Simple Bags of Typed Terms

This paper describes a system designed to disambiguate person names in a set of Web pages. In our approach Web documents are represented as different sets of features or terms of different types (bag of words, URLs, names and numbers). We apply Agglomerative Vector Space clustering that uses the similarity between pairs of analogous feature sets. This system achieved a value of 66% for Fα=0.2 a...

Publication date: 2008